Improvements of the Philips 2000 Taiwan Mandarin benchmark system

Authors

  • Yuan-Fu Liao
  • Nick Jui-Chang Wang
  • Max Huang
  • Hank Huang
  • Frank Seide
Abstract

In this paper, we present the Philips large vocabulary continuous Mandarin speech recognition system developed for the 2000 Taiwan Speech Input Technology Assessment. We systematically integrated key Mandarin components with up-to-date Western-language techniques to build a state-of-the-art Mandarin speech recognition system. These technologies include robust pitch extraction/tone modeling, context-dependent preme/core-final units, Chinese phrase/syllable trigram language model, linear discriminant analysis (LDA), cross-syllable modeling/decoding, speaker clustering and maximum likelihood linear regression (MLLR) adaptation. Among them, the major breakthroughs were our robust pitch extraction/tone modeling technology and the treatment of coarticulation across syllable boundaries. For the development set, we reduced last year’s best error rates by 44.8% to 67.8% relative in all three categories in which we participated. Moreover, for the evaluation set, we achieved the lowest unit error rates in all three categories.

Similar resources

Development of the Philips 1999 Taiwan Mandarin benchmark system

This paper describes the Philips Large Vocabulary Continuous Mandarin speech recognition system for the 1999 Taiwan benchmark. The basic system architecture is based on the Philips LVCSR technology developed for Western languages. However, several modifications are made to better suit the processing of spoken Chinese. In the paper, we present some experimental results on the two ...

MAT-2000 - design, collection, and validation of a Mandarin 2000-speaker telephone speech database

Mandarin speech data Across Taiwan (MAT) is a project initiated by members of the Association for Computational Linguistics and Chinese Language Processing (ACLCLP) to collect speech data through public telephone networks in Taiwan. In total, over 7000 Taiwanese individuals have provided speech data. The results were released as a series of MAT speech databases to the research community in Taiwan...

Biopharmaceutical Innovation System and the Influence of Policies: The Case of Taiwan (2000-2008)

This article discusses the influence of policies on the development of biopharmaceuticals. We choose Taiwan's experience for our empirical study and focus on its evolution between 2000 and 2008; during this period, the country provides an interesting example for further exploration of biopharmaceutical policies. Among all the policies, the two National Programs (National Research Progra...

On the Argument Structures of the Transitive Verb 'annoy; be annoyed; bother to do': A Study Based on Two Comparable Corpora

This paper investigates the transitive uses of the verb fan 'annoy; be annoyed; bother to do', which exhibit both similarities and disparities between Beijing Mandarin and Taiwan Mandarin, as far as the data from the Gigaword corpus, which contains data from Mainland China (XIN) and Taiwan (CNA), are concerned. In terms of similarities, the causative (and agentive) use(s) of the transitive fan is/are s...

Retrieval of broadcast news speech in Mandarin Chinese collected in Taiwan using syllable-level statistical characteristics

Spoken document retrieval has been extensively studied in recent years because of its high potential in various applications in the near future. Considering the monosyllabic structure of the Chinese language, a whole class of indexing features for retrieval of spoken documents in Mandarin Chinese using syllable-level statistical characteristics has been studied, and very encouraging experimental re...

Publication date: 2000